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ABSTRACT 


The  optimal  linear  combination  of  control  variates  is  well  known  when 
the  controls  are  assumed  to  be  unbiased.  We  derive  here  the  optimal  linear 
combination  of  controls  in  the  situation  where  bias  is  present.  This 
analysis  is  particularly  relevant  to  the  small-sample  theory  for  control 
variates  as  applied  to  the  steady-state  estimation  problem.  Results  for 
the  method  of  multiple  estimates  are  also  given. 

Keywords:  control  variables 

simulation  output  analysis, 
small-sample  theory 
steady-state  estimation  problem 
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1.  INTRODUCTION 


The  method  of  control  variates  has  been  extensively  studied  as  a  tech¬ 
nique  for  obtaining  variance  reductions  for  complex  simulations.  The 
method  basically  requires  that  the  practitioner  be  able  to  identify  pro¬ 
cesses  for  which  the  asymptotic  mean  is  known;  the  knowledge  of  those 
asymptotic  means  is  then  used  to  obtain  a  variance  reduction. 

Our  goal  here  is  to  study  a  specific  aspect  of  the  small-sample  theory 
for  control  variates.  Our  particular  interest  focuses  on  the  loss  of  effi¬ 
ciency  incurred  when  only  the  asymptotic  mean  is  known,  as  opposed  to  the 
true  (small-sample)  mean.  The  results  obtained  here  have  implications  for 
the  application  of  control  variates  to  the  steady-state  estimation  prob¬ 
lem.  Specifically,  in  many  steady-state  simulations,  only  the  asymptotic 
means  of  the  control  variates  are  known;  see,  for  example,  Section  8  of 
GLYNN  and  WHITT  (1985),  in  which  the  arrival  process  to  a  queue  is  used  as 
a  control. 

The  results  obtained  here  complement  other  small-sample  studies  on 
control  variates  in  which  the  focus  is  on  the  degradation  in  performance 
caused  by  estimation  of  the  optimal  control  coefficients;  see,  for  example 
LAVENBERG,  MOELLER,  and  WELCH  (1982),  RUBINSTEIN  and  MARCUS  (1985),  and 
VENKATRAMAN  and  WILSON  (1985). 

Our  methods  can  also  be  used  to  study  small-sample  properties  of  the 
method  of  multiple  estimates;  see  Section  5.  Concluding  remarks  are  stated 


in  Section  6. 


2.  BACKGROUND  ON  CONTROL  VARIATES. 

Suppose  that  one  wishes  to  estimate  a  parameter  r  from  a  simulation. 
Assume  that  it  is  possible  to  generate  variables  (Xj ,Y  ) , 

(X^  E  ,  Yi  t  ,  d  >  1)  such  that 


(2.1) 


—  1  r  v 

X  5  —  >  X,  =4  r 
n  n  ,  i 


Y  =  — ■  l  Y  =»  U 
n  n  i 


denotes  weak  convergence)  as  n-**>,  where  n  is  known.  Clearly,  the 


estimator  X  is  consistent  for  r  under  (2.1). 
n 

The  fundamental  observation  underlying  the  method  of  control  variates 
is  that  (2.1)  and  (2.2)  together  imply  that  for  \  e  ]Rd , 


U  (X)  s  X  -  XC(  Y  -  p)  r 
n  n  n 


as  n-**>,  so  u  (X)  is  also  consistent  for  r.  (We  adopt  here  the  conven- 
n 

tion  that  all  elements  of  are  represented  as  column  vectors;  aC 


denotes  the  transpose  of  a  t  R  ,)  Since  u  is  known,  U  (  X)  is  an 

n 

estimator  which  can  legitimately  be  constructed  from  the  simulated  data. 

Furthermore,  \  is  at  our  disposal,  so  that  X  may  be  chosen  so  as  to 

maximize  the  efficiency  of  the  estimator  IMX). 

To  maximize  the  asymptotic  efficiencv  of  U  (X),  it  is  common  to  assume 

n 

a  strengthened  form  of  (2.1)  and  (2.2) 


(2.3) 


n  (X  -  r,  Y  -  u)  ^  N(0,C) 


as  n-*® ,  where  N(0,C)  is  a  multivariate  normal  distribution  with  mean 
vector  zero  and  (d+L)  *  (d+1)  covariance  matrix 


a2  c  C 
x  xy 

c  c 
xy  yy 


(c  and  c^  are  dxl  and  dxd  matrices,  respectively.)  Given  (2.3), 
the  continuous  mappling  lemma  (BILLINGSLEY  (1968),  p.  31)  shows  that 


(2.4) 


n2(U  (X)  -  r)  cr2(X)  N(0,1) 
n 


as  tr*® ,  where 


2,.  x  2  .  t  t.  x  .  t  . 

a  (X)  »o  -  Xc  -c  X+Xc  X. 
x  xy  xy  yv 


To  optimize  the  asymptotic  efficiency  of  U  (X),  (2.4)  suggests  that 

2 

one  should  choose  X  so  as  to  minimize  a  (X).  Assuming  that  c  is 

yy 

positive  definite  (and  hence  non-singular),  the  value  X*  which  minimizes 


o  (X)  is  given  by 


(2.5) 


X  *  »  c  1  c 

yy  xy 


(see  p.  31)  of  ANDERSON  (1958);  the  corresponding  value  of  a  (X)  is  then 


(2.6) 


20  .  .  2  t  -1 

o  (X*)  =  a  -  c  c  c 

x  xy  yy  xy 


3.  THE  OPTIMAL  SOLUTION  IN  THE  PRESENCE  OF  BIAS 


The  development  of  the  formulas  (2.5)  and  2.6)  discussed  in  Section  2 

relied  heavily  on  the  asymptotic  limit  theory  for  the  estimator  U  (X).  A 

n 

somewhat  different  approach,  which  permits  study  of  small-sample  behavior, 

can  also  be  taken.  As  we  shall  see,  the  two  viewpoints  coincide,  under 

appropriate  regularity  conditions,  in  the  limit. 

A  reasonable  criterion  for  choosing  the  control  coefficient  vector  X 

is  to  choose  X  so  that  the  mean  squre  error  (MSE)  of  U  ( X)  is  minimized 

n 

For  this  criterion  to  make  sense,  assume  that: 


(3.1) 

Let 


and 


E(X2  +  YCY  )  <  «  for  n  >  1. 
n  n  n  — 


b  (n)  -  E( X  -  r)  , 
x  n 


b  (n)  =  E(Y  -  p)  , 

y  n 


var  (Xn)  , 


c  (n)  =  E(X  Y  )  -  E( X  )E(Y  )  , 

xy  n  n  n  n 


c  (n)  =  E(YtY  )  -  E(Y  )CE(Y  )  , 

yy  n  n  n  n 


MSE  (X)  =  E(U  (X)  -  r) 


2 


Then , 


(3.2)  MSE  (X)  -  var(U  (X))  +  (E(U  (X)  -  r))2 

n  n  n 

*  <J2(n)  -  \Cc  (n)  -  c  (n)C\ 
x  xy  xy 

+  \Cc  (n)X  +  b2(n) 
xy  x 

+  b  (n)\Cb  (n)  +  b  (n)b  (n)C\ 
x  y  x  y 

+  \Cb  (n)b  (n)C\ 

y  y 

2 

This  quadratic  form  in  X  has  precisely  the  same  structure  as  does  a  (X), 

* 

so  that  the  minimizer  X  of  MSE  (X)  is  given  by 

n  n 

(3.3)  X*  *  A(n)_1d(n) 

n 

where 


A(n)  =  c  (n)  +  b  (n)b  (n)C 

yy  y  y 

d(n)  *  c  (n)  +  b  (n)b  (n). 
xy  x  y 

(Of  course,  we  again  require  that  c^(n)  be  positive  definite.  Note  that 

this  implies  A(n)  is  positive  definite.)  The  corresponding  minimal  value 

of  MSE  (X)  is  then 
n 


5 


(3.4) 


MSE  (X)*a(n)+b(n) 
n  n  x  x 

-  d(n)tA(n)'1d(n). 


We  summarize  our  discussion  thus  far  with  the  following  proposition. 

(3.4)  PROPOSITION.  Assume  (3.1).  Then,  if  c  (n)  is  positive  definite, 

yy 

* 

the  minimizer  X  of  MSE  (X)  is  given  by  (3.3)  and  the  minimizing  value 
n  n 

of  MSE  (X)  is  given  by  (3.4). 


As  promised  earlier,  we  will  now  show  that  the  MSE  criterion  used  in 
this  section  coincides  with  the  asymptotic  efficiency  criterion  used  in 
Section  2.  We  will  require  the  additional  regularity  condition: 


(3.6)  {n(X  -  r)^  +  n(Y  -  u)t(Y  -  u)  :  n  >  1 }  is  uniformly  integrable 

n  n  n  — 

This  assumption  allows  us  to  pass  expectations  through  the  limit  theorem 
(2.3),  thereby  yielding 

I  i 

(3.7)  n b^( n)  ■*  0  , 


1 

2 

n  b^(n)  +  0 

2  2 

n  a  (n)  +  o 
x  x 


n  c  (n) 

+  c 

xy 

xy 

n  c  ( n ) 

+  c 

yy 

yy 

as  n><«>.  We  mav  therefore  conclude  tha 


n  A(n)  ♦  c  ,  and 

yy 

n  d(n)  -*■  c 

xy 


So  chat  if  c  is  positive  definite,  then 

yy 


=  (n  A(n ) )  *(n  d(n)) 


-1  * 

■>  c  c  =  X  . 

yy  xy 


Similarly , 


n  MSE  (X*)  +  o2(X*), 
n  n 


thereby  yielding  the  following  result. 

(3.8)  PROPOSITION.  Assume  (2.3)  and  (3.6)  ((3.6)  implies  (3.1)).  If  c 

yy 

•k  k  2 

is  positive  definite,  then  X  and  n  MSE  ( X  )  +  a  (X*),  as  n-*®. 

n  n  n 

This  proposition  is  a  formal  statement  of  the  fact  that  the  MSE  and 
asymptotic  efficiency  criterion  coincide  as  n>®. 

4.  SMALL-SAMPLE  THEORY  FOR  STEADY-STATE  CONTROL  VARIATE  SCHEMES 

Suppose  that  r  is  a  steady-state  parameter  of  a  stochastic  system, 
and  the  ( X j ) , (X  , Y  ) . . .  represent  observations  gathered  during  the 
time  evolution  of  the  system.  Our  primary  goal  here  is  to  obtain  an 

*  k 

asymptotic  expansion  for  X  and  MSE  (X  ). 

n  n  n 

We  will  need  to  assume  that  both  (2.3)  and  (3.6)  are  valid  for  our 


steady-state  simulation.  One  condition  which  guarantees  this  is  to  require 


that  {(X  ,Y  )  :  n  >  1}  be  a  non-delayed  regenerative  sequence  with  regen- 
n  n  — 

eration  times  T  -  0 ,  T^T^,....  If  one  imposes  the  moment  condition 


E  U  I  (X?  +  Y^Y  +  l)}2}  <  -, 
k=0 


then  (2.3)  and  (3.6)  follow  (see  pp.  99-104  of  CHUNG  (1967)  for  a  proof  in 
the  Markov  chain  setting;  the  general  case  can  be  argued  in  precisely  the 
same  way). 

We  will  further  require  that  the  bias  terms  take  the  form 


(4.1) 


bx(n) 


by(n) 


~  b  +  ofy 


—  b  +  o(— ) 
n  y  v  n 


for  some  constants  b  and  b  ,  where  o(d  )  represents  a  sequence  {a  } 

x  y  n  n 

such  that  a  /d  -*  0  as  n-*».  The  assumption  (4.1)  is  satisfied  in  a 
n  n 


variety  of  steady-state  contexts. 
Suppose,  for  example,  that 


(4.2) 


Then,  if  we  set 


EX  -  r  <  «. 
n 


b  ■=  ;  (EX  -  r), 

x  .  n 

n*  1 


it  follows  that 


Assuming  now  chat  (2.3),  (3.6),  and  (4.1)  are  in  force,  observe  tha 


\  =  A(n)  ^  d(n) 

n 


-1 


=  (n  A(n))  (n  d(n)) 


ed  by  (3.6)) 

and 

(4.1), 

it  i 

n  A(  n )  =  c 

1 

+  — 

b  bC  + 

of-h 

yy 

n 

y  y 

'  n 

n  d(n)  *  c 

1 

+  — 

b  b  + 

o(-M 

xy 

n 

x  y 

n 

Hence,  assuming  c  is  a  positive  definite, 

yy 


(n  A(n))  '  =  (c  •  rI  +  —  c  bC  +  of— ) ) 
yy  n  yy  y  y  n 


■  ( I  +  —  c  *b  bC  +  o  (— )  1  lc 

n  yy  y  y  vn  ' '  yy 


Now  for  n  large  enough,  the  matrix 


F(n)  =  —  c  b  b  +  of—  )■ 
n  yy  y  y  vn 


has  been  spectral  radius  less  than  one,  so  that 


(I  +  F( n ) )~ 1  =  I  -  F(n)  +  F(n)2  -  F(n)3  +  ... 


1  -1L  .t  fl, 

=  I - c  bb  +  of— ). 

n  yy  y  y  n 


Consequent  1  v 


(i  — -  c  'b  b£  +  o(— )1  •  c  *  •  (c  +  —  b  b  +  of— 1) 
n  yy  y  y  v  n '  yy  v  xy  n  x  y  n 


-1  1  -1.  .  t  -1 

c  c - c  bbc  c 

yy  xy  n  yy  y  y  yy  xy 


+  —  c  ^b  b  +  of— 1 . 
n  yy  x  y  v  n 


Similarly,  we  find  that 


*2  -1 

n  MSE(X  )  =  a  -  c  c  c 

n  x  xy  yy  xy 


1  Lt  -1  1,2 

-—bbc  c  +  —  b 
n  x  y  yy  xy  n  x 


1  t  —  i 
-  —  c  c  b  b 
n  xy  yy  x  y 


1  t  -1  .  t  -1 

+  —  c  c  bbc  c 
n  xy  yy  y  y  yy  xy 


+  o 


C^- 


We  therefore  obtain  the  following  result. 


(4.5)  THEOREM.  Assume  (2.3),  (3.6),  and  (4.1). 
nite,  then 


If  c 

yy 


X  =  X  -  —  c  bbc  c 
n  n  yy  y  y  yy  xy 


1  -1,  , 

+  —  c  b  b  + 
n  yy  x  y 


is  positive  def 


and 


n  MSE  (X  ) 
n  n 


2,  *  2  t  -1 

a  (X  ) - bbc  c 

n  x  y  yy  xy 


1  c  “lu  .  t  -1 
+  —  c  c  bbc  c 
n  xy  yy  y  y  yy  xy 


+  -  b2  + 
n  x 


It  is  of  some  interest  to  examine  the  degradation  in  MSE  of  the 

control  variate  scheme  when  the  asymptotic  control  vector  X*  is  used, 

* 

rather  than  the  small-sample  optimal  vector  X^. 

Let  M^(X)  =  n  MSE^(X).  It  is  easily  verified,  from  (3.2),  that  for 
arbitrary  X  and  Xq, 


(4.6) 


where 


and 


M  (X)  -  M  (X  )  -  V  M  (X  )  •  (X  -  X  ) 

n  no  no  o 


+  (X  -  X  )  H  (X  -  X  ) 
on  o 


V  Mn(Xo)  =  2n(cyy (n)  +  by(n)by(n)  )  Xq 


-  2nc  (n)  +  2n  b  (n)b  (n) 
xy  x  y 


H  »  n  c  (n)  +  n  b  (n)  b  (n). 
n  yy  y  y 


(This  is  just  a  Taylor  expansion  of  M  (X)  around  X  =  X  .)  Setting  X 

n  o 

*  * 

X*  and  X  *  X  ,  we  observe  that  V  M  (X  )  =  0  so  that  (4.6)  becomes 
on  n  n 


(4.7)  M  (X*)  =  M  (X*)  +  (X*  -  X*)t  H  (X*  -  X*). 

n  n  n  n  *■>  n  ' 


Letting 


d 


c  b  b  c  c 

yy  y  y  yy  xy 


c  ^  b  b 
yy  x  y 
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* 

Theorem  4.5  shows  that  X 


c  as  n ■*a>.  It  follows 

yy 


*  X*  -  d/n  +  o(l/n),  whereas  (3.7)  yields 
from  (4.7)  that 


H 

n 


M  ( X* ) 


M_  ( X* )  +  1  dCc  d  +  o(-2). 
n  yy  n 


n  n 


As  a  consequence,  we  obtain  the  following  result. 


(4.8)  PROPOSITION .  Assume  (2.3),  (3.6),  (4.1),  and  that  c  is  positive 

yy 

definite.  Then,  the  degradation  in  MSE  (X)  caused  by  using  X*  rather 

n 

*  -3  t  3 

than  X  is  given  by  n  d  c  d  +  o(l/n  ).  (Since  c  is  positive 

n  yy  yy 

definite,  dfcc  d  is  always  nonnegative.) 


Thus,  the  degradation  in  MSE  is  of  small  order,  since  it  decreases  as 
the  reciprocal  of  the  cube  of  the  sample  size.  However,  in  certain  small- 
sample  situations,  the  degradation  could  be  significant.  In  such  a  situa¬ 
tion,  Theorem  4.5  provides  a  possible  key  to  improving  the  performance  of 
the  control  scheme. 

Let  ^  *  X*  -  d/n.  Noting  that  £  *  X  +  o(l/n),  it  follows  from 

n  n  n 

(4.7)  that  M  (£  )  =  M  (X  )  +  o(l/n^).  Thus,  using  X  as  the  control 
m  n  n  n  n 

★ 

vector  is  "almost"  as  good  as  using  the  optimal  vector  X^. 

Clearly,  in  order  to  obtain  optimal  asymptotic  efficiency  from  the 

control  variate  scheme,  X*  ■  c  *c  must  be  estimated.  Ordinarily, 

yy  xy 

this  will  require  consistent  estimation  of  both  c  and  c  ;  see  IGLEHART 

yy  xy 


and  LEWIS  (1979)  for  details  in  the  regenerative  case.  Thus,  if  the  simu¬ 
lation  can  estimate  the  quantities  b  and  b  ,  £  can  be  obtained  and 

x  y  n 

used  to  improve  performance.  Note  that  even  if  the  estimators  are  not  par¬ 
ticularly  accurate,  their  influence  "washes  out"  fairly  rapidly,  since  ^ 

*  X*  as  n-*’a>.  Thus,  one  should  never  lose  too  much  efficiency,  even  with 
poor  estimators. 

5.  SMALL-SAMPLE  THEORY  FOR  STEADY-STATE  MULTIPLE  ESTIMATE  SCHEMES 

Our  goal  here  is  to  establish  small-sample  results,  analogous  to  those 
obtained  in  Section  4,  for  the  method  of  multiple  estimates.  Given  a 
steady-state  parameter  re®,  suppose  that  one  can  generate  an  Re¬ 
valued  sequence  Z  such  that 

l 

(5.1)  n2  (Zn  -  r  e)  N(0,C) 

as  n-***>,  where  z  ■  (Z  +,#,+Z  )/n,  e  c  IR^  a  vector  consisting  entirely 

n  1  n  J 

of  l's,  and  C  is  a  d  x  d  covariance  matrix.  The  idea  behind  the  method 
of  multiple  estimates  is  that  for  any  vector  a  such  that  a^e  »  1,  (5.1) 
implies  that 

aC  Z  =*  r  ; 

one  now  chooses  a  so  as  to  maximize  efficiency.  HEIDELBERGER  (1980)  ex¬ 
plored  this  technique  in  the  context  of  Markov  chains,  and  showed  how  one 
can  generate  Z  's  with  property  (5.1). 


i  -  jk  wjm 


v  v  w  v  r*  rrr*r*i 


K  -*r  nr  «r  v  .nw; 


It  is  worth  pointing  out  that  the  method  of  multiple  estimates  can  be 
viewed  as  a  special  case  of  control  variates.  Let 


X  -  etZ  /d 
n  n 


Then, 


aC  Z  =  X  -  XC  Y 
n  n  n 


where  Y  =  Z  and  X  *  e/d  -  a.  The  constraint  a^e  “  1  translates  into 
n  n 

choosing  X  so  that  X^e  =  0.  With  the  (X^.Y^j's  defined  in  this  way, 
we  are  again  in  the  setting  of  Section  2  through  4.  Although  it  would  be 
possible  to  derive  all  the  asymptotic  theory  for  multiple  estimates  by  ap¬ 
pealing  to  the  previously  developed  results  for  control  variates,  it  seems 
easier  to  obtain  them  directly. 

Note  that  the  continuous  mapping  lemma,  as  applied  to  (5.1),  yields 


(aC  -  r)  o^( a)  N(0,1) 


2  t  2  t 

where  a  (a)  ■  a  Ca.  The  minimizer  of  a  ( a)  subject  to  a  e  =  1  is  given 

by 


*  ,  t  -1 

a  =  C  e/eC  e, 

provided  that  C  is  positive  definite  (see  p.  60  of  RAO  (1973)).  The  mini- 

2  2  t  -1  -1 

mal  value  of  o  (a)  is  then  given  by  a  (a*)  =  (e  C  e)  .  The  following 


theorem  summarizes  the  situation 


(5.2)  THEOREM.  Assume  Chat  (5.1)  holds  with  C  positive  definite.  Then 


2  t  -1  -1  .... 

(i)  a  (a)  has  minimal  value  (e  C  e)  ,  and  is  minimized  at  a*  * 


C-1e/(etC~1e). 


If,  in  addition,  {n(Z  -  re)C  (Z  -  re)  :  n  >  1}  is  uniformly  integrable 

n  n  — 


and  if  EZ  =  re  +  b/n  +  o( 1/n)  for  some  b  e  1R  ,  then: 
n  * 


t  —  2 

(ii)  MSE  (a)  =  E(a  Z  -  r)  has  minimal  value 
n  n 


(eC(C(n)  +  b(n)b(n)1')  ^e)  *  and  is  minimized  at 


.*  (C(n )  +  b  (n)b(n)t)"1e  , 


n  eC(C(n)  +  b  (n)b(n)C)  *e 


where  C(n)  =  E(Z  ZC)  -  (EZ  )  (EZ  )C  and  b(n)  =  EZ  -  re, 
n  n  n  n  n 


(iii) 


C  *e 


,  -1 

_  f  J  +  1  e  C  bb  C  e 

t„-l  ,  '  n  ,  t„-l 


(e  C  e) 


(e  C  e) 


1  C  lbbCC  le  . 

- - - . - +  o(—  and 

n  .  t  -1  .  n' 

(e  C  e) 


.  *,  1  , .  .  e'c"  bb:c'  e ,  ,1 , 

MSE(V  ’  ttt  (1  * - r-ri — >  +  °<n'  • 

e  C  e  n  e  C  e 


(iv)  MSE( a*) 


*  -3  t  .3 

MSE(a  )  +  n  d^Cd  +  o(n  ),  where 


^etc"1bbtC_1e1  „-l _  c'1bbtC  *e 


,  c  -1  .2 
(e  C  e) 


-)  c" 


e  - 


t  -1 
e  C  e 
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Thus,  the  results  obtained  for  the  method  of  multiple  estimates  are 
qualitatively  similar  to  those  obtained  for  control  variates. 

6.  CONCLUSIONS. 

Using  the  MSE  criterion,  we  have  shown  that  under  rather  general 

* 

conditions,  the  small-sample  optimal  control  coefficients  X  ,  for  steady- 
state  simulations,  differ  from  the  asymptotically  optimal  control  coeffi¬ 
cients  X*  by  a  factor  of  order  n  The  first-order  error  term  involves 
only  the  asymptotic  covariance  structure,  and  the  first-order  bias  terms; 
the  (exact)  small-sample  covariance  structure  plays  no  role,  even  in  the 
case  of  a  non-stationary  steady-state  simulation.  The  loss,  in  MSE  effi¬ 
ciency,  created  by  using  the  asymptotically  optimal  X*,  rather  than  the 

*  -3 

small-sample  optimal  Xn,  is  of  order  n  .  Thus,  the  loss  in  efficiency 
is  of  small  order. 

Similar  results  hold  for  the  method  of  multiple  estimates. 
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